Concept-based Recommendations for Internet Advertisement



Dmitry I. Ignatov and Sergei O. Kuznetsov 

Higher School of Economics, Department of Applied Mathematics 
Kirpichnaya 33/5, Moscow 105679, Russia 
{dignatov, skuznetsov}@hse.ru



Abstract. The problem of detecting terms that can be interesting to 
the advertiser is considered. If a company has already bought some ad- 
vertising terms which describe certain services, it is reasonable to find 
out the terms bought by competing companies. A part of them can be 
recommended as future advertising terms to the company. The goal of 
this work is to propose better interpretable recommendations based on 
FCA and association rules. 

1 Introduction 

Contextual Internet advertising is a form of e-commerce. The largest revenues
of the major players on this market, such as search engines, are obtained from
so-called search-sensitive advertisement, i.e., advertisement close in sense to
user queries. Here we consider the problem of detecting terms that can be in- 
teresting to an advertiser. Assume that a company F has already bought some 
advertising terms which describe certain services. As a rule, there are already
competing companies on the market; therefore it is reasonable to find terms
bought by them. These terms can be compared to those bought by F and part 
of them can be recommended as future advertising terms to F. The goal of this 
work is to propose well-interpretable recommendations based on FCA. The rest 
of the paper is organized as follows: First we recall main definitions from FCA 
and rule mining. Then we consider experimental data and the problem state- 
ment. Afterwards, we propose morphology-based and ontology-based metarules 
that can be derived without experimental data. We conclude the paper with 
experiments and their discussion. 

2 Main definitions 

First, we recall some basic notions from Formal Concept Analysis (FCA) [1].
Let $G$ and $M$ be sets, called the set of objects and the set of attributes,
respectively, and let $I \subseteq G \times M$ be a relation: for $g \in G$,
$m \in M$, $gIm$ holds iff the object $g$ has the attribute $m$. The triple
$K = (G, M, I)$ is called a (formal) context. If $A \subseteq G$ and
$B \subseteq M$ are arbitrary subsets, then the Galois connection is given by
the following derivation operators:

$$A' = \{m \in M \mid gIm \text{ for all } g \in A\},$$
$$B' = \{g \in G \mid gIm \text{ for all } m \in B\}.$$

If several contexts are considered, the derivation operator of a context
$(G, M, I)$ is denoted by $(\cdot)^I$.

The pair $(A, B)$, where $A \subseteq G$, $B \subseteq M$, $A' = B$, and
$B' = A$, is called a (formal) concept (of the context $K$) with extent $A$ and
intent $B$ (in this case we also have $A'' = A$ and $B'' = B$). For
$B, D \subseteq M$ the implication $B \to D$ holds if $B' \subseteq D'$.
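The derivation operators and the concept condition can be illustrated on a tiny context. The following is a minimal sketch, assuming a context stored as a set of (object, attribute) pairs; the firms `f1`..`f3`, the helper names, and the incidence relation are illustrative and not taken from the experimental data.

```python
# Toy formal context K = (G, M, I); firms and terms are made up for illustration.
I = {
    ("f1", "hosting web"), ("f1", "cheap hosting"),
    ("f2", "hosting web"), ("f2", "cheap hosting"), ("f2", "domain hosting"),
    ("f3", "domain hosting"),
}
G = {g for g, _ in I}
M = {m for _, m in I}


def prime_objects(A):
    """A': the attributes shared by all objects in A."""
    return {m for m in M if all((g, m) in I for g in A)}


def prime_attributes(B):
    """B': the objects that have all attributes in B."""
    return {g for g in G if all((g, m) in I for m in B)}


def is_concept(A, B):
    """(A, B) is a formal concept iff A' = B and B' = A."""
    return prime_objects(A) == set(B) and prime_attributes(B) == set(A)


def implication_holds(B, D):
    """The implication B -> D holds iff B' is a subset of D'."""
    return prime_attributes(B) <= prime_attributes(D)
```

In this toy context the pair ({f1, f2}, {hosting web, cheap hosting}) satisfies both conditions, so it is a concept, and the implication {cheap hosting} → {hosting web} holds because every firm that bought the first term also bought the second.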

In data mining applications, an element of $M$ is called an item and a subset
of $M$ is called an itemset.

The support of a subset of attributes (an itemset) $P \subseteq M$ is defined as
$supp(P) = |P'|$. An itemset is frequent if its support is not less than a given
minimum support (denoted by $min\_supp$). An itemset $P$ is closed if there
exists no proper superset with the same support. The closure of an itemset $P$
(denoted by $P''$) is the largest superset of $P$ with the same support. The task
of frequent itemset mining consists of generating all (closed) itemsets (with
their supports) with supports greater than or equal to a specified $min\_supp$.
An association rule is an expression of the form $I_1 \to I_2$, where $I_1$ and
$I_2$ are arbitrary itemsets ($I_1, I_2 \subseteq M$), $I_1 \cap I_2 = \emptyset$
and $I_2 \neq \emptyset$. The left side, $I_1$, is called the antecedent; the
right side, $I_2$, is called the consequent. The support of an association rule
$r: I_1 \to I_2$ is defined as¹ $supp(r) = supp(I_1 \cup I_2)$. The confidence
of an association rule $r: I_1 \to I_2$ is defined as the conditional
probability that an object has itemset $I_2$, given that it has itemset $I_1$:
$conf(r) = supp(I_1 \cup I_2) / supp(I_1)$. An association rule $r$ with
$conf(r) = 100\%$ is an exact association rule (or implication [1]), otherwise
it is an approximate association rule. An association rule is valid if
$supp(r) \ge min\_supp$ and $conf(r) \ge min\_conf$. An itemset $P$ is a
generator if it has no proper subset $Q$ ($Q \subset P$) with the same support.
Let $FCI$ be the set of frequent closed itemsets and let $FG$ be the set of
frequent generators. The informative basis for approximate association rules is
$IB = \{r: g \to (f \setminus g) \mid f \in FCI \wedge g \in FG \wedge g'' \subset f\}$.
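The definitions of support, confidence, closure, and validity can be traced on a small example. This is a minimal sketch, assuming the context is stored as a dictionary from objects to their attribute sets; the four firms and their purchases are invented for illustration, and absolute support is used.

```python
# Toy context: firm -> set of purchased terms (illustrative data only).
ROWS = {
    "f1": {"e vitamin", "c vitamin"},
    "f2": {"e vitamin", "c vitamin", "vitamin"},
    "f3": {"e vitamin"},
    "f4": {"c vitamin", "vitamin"},
}


def supp(itemset, rows):
    """Absolute support |P'|: number of objects containing every item of P."""
    return sum(1 for attrs in rows.values() if set(itemset) <= attrs)


def conf(antecedent, consequent, rows):
    """conf(I1 -> I2) = supp(I1 u I2) / supp(I1); 0.0 if I1 never occurs."""
    s = supp(antecedent, rows)
    return supp(set(antecedent) | set(consequent), rows) / s if s else 0.0


def closure(itemset, rows):
    """P'': intersection of the attribute sets of all objects containing P."""
    covering = [attrs for attrs in rows.values() if set(itemset) <= attrs]
    if not covering:  # empty extent: the closure is the whole attribute set
        return set().union(*rows.values())
    return set.intersection(*covering)


def is_valid(antecedent, consequent, rows, min_supp, min_conf):
    """A rule is valid if both thresholds are reached."""
    s = supp(set(antecedent) | set(consequent), rows)
    return s >= min_supp and conf(antecedent, consequent, rows) >= min_conf
```

Here {vitamin} is not closed, since its closure is {c vitamin, vitamin}; consequently {vitamin} → {c vitamin} is an exact rule, while {e vitamin} → {c vitamin} has confidence 2/3.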

3 Initial Data and Problem Statement 

For experimentation we used data of US Overture [2], which were first
transformed into the standard context form. In the resulting context
$K = (G, M, I)$, objects from $G$ stand for advertising companies (advertisers)
and attributes from $M$ stand for advertising terms (bids); $gIm$ means that
advertiser $g$ bought term $m$. In this context $|G| = 2000$, $|M| = 3000$,
$|I| = 92345$.

In our context, the number of attributes per object is bounded as follows:
$13 \le |g'| \le 947$. For objects per attribute we have $18 \le |m'| \le 159$.
From this context one had to compute formal concepts of the form (advertisers,
bids) that represent market sectors. Formal concepts of this form can be further
used for recommendation to the companies on the market which did not buy bids
contained in the intent of the concept. In other words, an empty cell $(g, m)$
of the context can be considered as a recommendation to advertiser $g$ to buy
bid $m$, if this advertiser bought other bids contained in the intent of any
concept. This can also be represented as association rules of the form "If an
advertiser bought bid $a$, then one can recommend this advertiser to buy term
$b$". See [3] for the use of association rules in recommendation systems.

¹ In this paper we use absolute values, but the support of an association rule
$r$ is also often defined as $supp(r) = supp(I_1 \cup I_2)/|G|$.

We consider the following context: $K_{FT} = (F, T, I_{FT})$, where $F$ is the
set of advertising firms (companies), $T$ is the set of advertising terms, or
phrases, and $f I_{FT} t$ means that firm $f \in F$ bought advertising term
$t \in T$.

For constructing recommendations we used the following approaches and 
tools: 

1. D-miner algorithm for detecting large market sectors as concepts; 

2. Coron system for constructing association rules; 

3. Construction of association metarules using morphological analysis; 

4. Construction of association metarules using ontologies (thematic catalogs). 

4 Standard approach to rule mining 

4.1 Detecting large market sectors with D-miner. 

D-miner is a freely available tool [4], [5] which constructs the set of concepts
satisfying given constraints on sizes of extents and intents (icebergs and dual
icebergs). D-miner takes as input a context and two parameters, the minimal
admissible extent and intent sizes, and outputs a "band" of the concept lattice:
all concepts satisfying the constraints given by the parameter values
($|intent| \ge m$ and $|extent| \ge n$, where $m, n \in \mathbb{N}$, see Table 1).
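The notion of a "band" can be made concrete with a brute-force enumeration. The sketch below is not the D-miner algorithm; it assumes a very small context (the enumeration is exponential in the number of objects) and uses invented firms and terms, only to show which concepts survive the two size constraints.

```python
from itertools import combinations


def concepts(rows):
    """All formal concepts (extent, intent) of a small context, by brute force.

    rows maps each object to its attribute set. For every subset A of objects
    the pair (A'', A') is a concept, and every concept arises this way.
    """
    all_attrs = set().union(*rows.values())
    objs = sorted(rows)
    found = set()
    for k in range(len(objs) + 1):
        for A in combinations(objs, k):
            intent = set(all_attrs)
            for g in A:
                intent &= rows[g]          # A': attributes common to A
            extent = frozenset(g for g in objs if intent <= rows[g])  # A''
            found.add((extent, frozenset(intent)))
    return found


def band(rows, min_extent, min_intent):
    """Concepts with |extent| >= min_extent and |intent| >= min_intent."""
    return {(e, i) for e, i in concepts(rows)
            if len(e) >= min_extent and len(i) >= min_intent}


# Illustrative data only.
TOY = {
    "f1": {"hosting web", "cheap hosting"},
    "f2": {"hosting web", "cheap hosting", "hotel miami"},
    "f3": {"hotel miami"},
}
```

With both thresholds set to 2, the only concept left in the band of this toy lattice is the pair ({f1, f2}, {hosting web, cheap hosting}), a miniature "hosting market".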



Table 1. D-miner results.

Fig. 1. A concept lattice and its band output by D-miner.



We give examples of intents of formal concepts for the case $|L| = 53$, where
$L$ is a concept lattice.

Hosting market.

{ affordable hosting web, business hosting web, cheap hosting, cheap hosting
site web, cheap hosting web, company hosting web, cost hosting low web,
discount hosting web, domain hosting, hosting internet, hosting page web,
hosting service, hosting services web, hosting site web, hosting web }

Hotel market.

{ angeles hotel los, atlanta hotel, baltimore hotel, dallas hotel, denver hotel,
hotel chicago, diego hotel san, francisco hotel san, hotel houston, hotel miami,
hotel new orleans, hotel new york, hotel orlando, hotel philadelphia, hotel
seattle, hotel vancouver }

Distance communication market. 

{ call distance long, calling distance long, calling distance long plan, carrier 
distance long, cheap distance long, company distance long, company distance 
long phone, discount distance long, distance long, cheap calling distance long, 
distance long phone, distance long phone rate, distance long plan, distance long 
provider, distance long rate, distance long service } 

Weight loss drug market. 

{ adipex buy, adipex online, adipex order, adipex prescription, buy didrex, 
buy ionamin, ionamin purchase, buy phentermine, didrex online, ionamin on- 
line, ionamin order, online order phentermine, online phentermine, order phen- 
termine, phentermine prescription, phentermine purchase } 

4.2 Recommendations based on association rules. 

Using the Coron system (see [6]) we construct the informative basis of association
rules [7]. We have chosen the informative basis, since it provides a compact and
effective way of representing the whole set of association rules. The results are
given in Table 2.



Table 2. Properties of informative basis.

min_supp   max_supp   min_conf   max_conf   number of rules
30         86         0.9        1          101 391
30         109        0.8        1          144 043



Here are some examples of association rules. 

- {e vitamin} $\to$ {c vitamin}, supp=31 [1.55%]; conf=0.861 [86.11%];

- {gift graduation} $\to$ {anniversary gift}, supp=41 [2.05%]; conf=0.820
[82.00%].

The value supp = 31 of the first rule means that 31 companies bought both
phrases "e vitamin" and "c vitamin". The value conf = 0.861 means that 86.1% of
the companies that bought the phrase "e vitamin" also bought the phrase
"c vitamin".

To make recommendations for each particular company one may use an approach
proposed in [3]. For company $f$ we find all association rules, the antecedents
of which contain all the phrases bought by the company, then we construct the
set $T_u$ of unique advertising phrases not bought by the company $f$ before.
Then we order these phrases by decreasing confidence of the rules where the
phrases occur in the consequents. If buying a phrase is predicted by several
rules (i.e., the phrase is in the consequents of several rules), we take the
largest confidence.
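The ranking step can be sketched in a few lines. This is a minimal sketch under one stated assumption: a rule is treated as applicable to a company when its antecedent is a subset of the phrases the company already bought, which is one possible reading of the matching condition. The first two rules are the examples given above; the third rule and the function name `recommend` are hypothetical.

```python
def recommend(bought, rules):
    """Rank phrases not yet bought by the largest confidence predicting them.

    bought: set of phrases already bought by the company.
    rules:  iterable of (antecedent, consequent, confidence) triples.
    Assumption: a rule applies when its antecedent is a subset of `bought`.
    """
    best = {}
    for antecedent, consequent, confidence in rules:
        if set(antecedent) <= bought:
            for phrase in set(consequent) - bought:
                # several rules may predict the same phrase: keep the maximum
                best[phrase] = max(best.get(phrase, 0.0), confidence)
    # decreasing confidence, ties broken alphabetically
    return sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))


RULES = [
    ({"e vitamin"}, {"c vitamin"}, 0.861),
    ({"gift graduation"}, {"anniversary gift"}, 0.820),
    ({"e vitamin"}, {"c vitamin", "vitamin"}, 0.700),  # hypothetical rule
]
RANKED = recommend({"e vitamin", "gift graduation"}, RULES)
```

For this input the phrase "c vitamin" is predicted by two rules and is ranked by the larger confidence, 0.861.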

5 Mining metarules 

5.1 Morphology-based Metarules

Each attribute of our context is either a word or a phrase. Obviously, synonymous
phrases are related to the same market sectors. Advertising companies usually
have thematic catalogs composed by experts; however, due to the huge number
of advertising terms, manual composition of catalogs is a difficult task. Here we
propose a morphological approach for detecting similar bids.

Let $t$ be an advertising phrase consisting of several words (here we disregard
the word sequence): $t = \{w_1, w_2, \ldots, w_n\}$. A stem is the root or roots
of a word, together with any derivational affixes, to which inflectional affixes
are added [8]. The stem of word $w_i$ is denoted by $s_i = stem(w_i)$ and the set
of stems of words of the phrase $t$ is denoted by
$stem(t) = \bigcup_i stem(w_i)$, where $w_i \in t$. Consider the formal context
$K_{TS} = (T, S, I_{TS})$, where $T$ is the set of all phrases and $S$ is the set
of all stems of phrases from $T$, i.e. $S = \bigcup_i stem(t_i)$. Then
$t I_{TS} s$ denotes that the set of stems of phrase $t$ contains $s$.

In this context we construct rules of the form $t \to s_i^{I_{TS}}$ for all
$t \in T$, where $(\cdot)^{I_{TS}}$ denotes the prime operator in the context
$K_{TS}$. Then the rule $t \to s_i^{I_{TS}}$ of the context $K_{TS}$ (we call it
a metarule, because it is not based on experimental data, but on implicit
knowledge residing in natural language constructions) corresponds to
$t \xrightarrow{FT} s_i^{I_{TS}}$, an association rule of the context
$K_{FT} = (F, T, I_{FT})$. If the values of support and confidence of this rule
in the context $K_{FT}$ do not exceed certain thresholds, then the association
rules constructed from the context are considered not very interesting.



Table 3. A toy example of context $K_{FT}$ for "long distance calling" market.

Table 4. A toy example of context $K_{TS}$ for "long distance calling" market.

Metarules of the following forms also seem to be reasonable. First, one can
look for rules of the form $t \xrightarrow{FT} \bigcup_i s_i^{I_{TS}}$, i.e.,
rules whose consequent contains all terms containing at least one word with a
stem common to a word in the antecedent term. Obviously, constructing rules of
this type may result in the fusion of phrases related to different market
sectors, e.g. "blackjack" and "black coat". Second, we considered rules of the
form $t \xrightarrow{FT} (\bigcup_i s_i)^{I_{TS}}$, i.e., rules with the
consequent with the set of stems being the same as the set of stems of the
antecedent. Third, we also propose to consider metarules of the form
$t_1 \xrightarrow{FT} t_2$, where $t_2^{I_{TS}} \subseteq t_1^{I_{TS}}$. These
are rules in which the set of stems of the consequent is contained in the set of
stems of the antecedent.
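The consequents of the morphological metarules can be computed directly from the stems, and the resulting rule can then be scored in the firm-term context. The following is a minimal sketch: `toy_stem` is a hypothetical stand-in for a real stemmer that only strips two English suffixes, the antecedent itself is dropped from every consequent, and the phrases and purchases are illustrative.

```python
def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips 'ing' or 's'."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def stems(phrase):
    """stem(t): the set of stems of the words of phrase t (order ignored)."""
    return frozenset(toy_stem(w) for w in phrase.split())


def share_a_stem(t, phrases):
    """Phrases sharing at least one stem with t (union of the stem extents)."""
    return {u for u in phrases if u != t and stems(u) & stems(t)}


def contain_all_stems(t, phrases):
    """Phrases whose stem set contains every stem of t."""
    return {u for u in phrases if u != t and stems(t) <= stems(u)}


def stems_within(t, phrases):
    """Phrases whose stem set is contained in the stem set of t."""
    return {u for u in phrases if u != t and stems(u) <= stems(t)}


def supp_conf(t, consequent, firm_terms):
    """Absolute support and confidence of t -> consequent in the firm-term context."""
    has_t = [terms for terms in firm_terms.values() if t in terms]
    has_all = [terms for terms in has_t if consequent <= terms]
    return len(has_all), (len(has_all) / len(has_t) if has_t else 0.0)


# Illustrative data only.
PHRASES = ["distance long", "calling distance long", "call distance long",
           "distance long phone", "ink jet", "ink"]
FIRM_TERMS = {"f1": {"ink jet", "ink"}, "f2": {"ink jet", "ink"},
              "f3": {"ink jet"}, "f4": {"ink"}}
```

In this toy setting `stems_within("ink jet", PHRASES)` yields {"ink"}, and scoring {ink jet} → {ink} against the four invented firms gives support 2 and confidence 2/3. A production version would replace `toy_stem` with a proper stemming algorithm.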



Examples of metarules.

- $t \xrightarrow{FT} s_i^{I_{TS}}$

{last minute vacation} $\to$ {last minute travel}
Supp = 19, Conf = 0.90

- $t \xrightarrow{FT} \bigcup_i s_i^{I_{TS}}$

{mail order phentermine} $\to$ {adipex online order, adipex order, . . . ,
phentermine prescription, phentermine purchase, phentermine sale}
Supp = 19, Conf = 0.95

- $t \xrightarrow{FT} (\bigcup_i s_i)^{I_{TS}}$

{distance long phone} $\to$ {call distance long phone, . . . ,
carrier distance long phone, distance long phone rate, distance long phone service}
Supp = 37, Conf = 0.88

- $t_1 \xrightarrow{FT} t_2$, $t_2^{I_{TS}} \subseteq t_1^{I_{TS}}$

{ink jet} $\to$ {ink}, Supp = 14, Conf = 0.7



5.2 Constructing ontologies and ontology-based metarules.

Here we use simple tree-like ontologies, where the closeness to the root of a tree
defines generality of ontology concepts, which are advertisement phrases. For
example, we use manually constructed WordNet-like ontologies of market sectors.
In our ontology of the pharmaceutical market the concept "pharmaceutical
product" is more general than that of "vitamin." We introduce two operators
acting on the set of advertising words $T$. The generalization operator
$g_i(\cdot): T \to T$ takes a concept to a more general concept $i$ levels higher
in the generality order. The neighborhood operator $n(\cdot): T \to T$ takes a
concept to the set of sibling concepts.

Now we define two types of metarules for ontology: a generalization rule
$t \to g_i(t)$ and a neighborhood rule $t \to n(t)$. These rules can also be
considered as association rules of the context $K_{FT} = (F, T, I_{FT})$, which
allows one to understand which of them are well supported by data.

Examples of metarules for pharmaceutical market.

Rule of the form $t \to n(t)$, where $t$ = "B_VITAMIN".

{B_VITAMIN} $\to$ {B_COMPLEX_VITAMIN, B12_VITAMIN, C_VITAMIN, . . .
D_VITAMIN, DISCOUNT_VITAMIN, E_VITAMIN, MINERAL_VITAMIN, . . .
MULTIVITAMIN, SUPPLEMENT_VITAMIN, VITAMIN}

Rules of the form $t \to g_1(t)$, where $t$ = "B_VITAMIN", $g_1(t)$ = "VITAMINS".

{B_VITAMIN} $\to$ {VITAMINS}.
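The two ontology operators can be sketched on a tree stored as a child-to-parent map. The tree below is a hypothetical fragment, not the manually constructed ontology used in the experiments, and the function names `generalize` and `neighbors` are illustrative.

```python
# Hypothetical fragment of a tree-like ontology: child -> parent.
PARENT = {
    "VITAMINS": "PHARMACEUTICAL_PRODUCT",
    "B_VITAMIN": "VITAMINS",
    "C_VITAMIN": "VITAMINS",
    "E_VITAMIN": "VITAMINS",
}


def generalize(t, i=1):
    """g_i(t): the concept i levels above t in the generality order."""
    for _ in range(i):
        if t not in PARENT:
            raise ValueError(f"{t!r} has no more general concept")
        t = PARENT[t]
    return t


def neighbors(t):
    """n(t): the sibling concepts of t (same parent, t itself excluded)."""
    if t not in PARENT:
        return set()  # the root has no siblings
    return {c for c, p in PARENT.items() if p == PARENT[t] and c != t}
```

On this fragment `generalize("B_VITAMIN")` returns "VITAMINS" and `neighbors("B_VITAMIN")` returns the other two vitamin concepts; each result can then be scored as an association rule in the firm-term context.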



6 Experimental Validation 



For validation of association rules we used an adapted version of cross-validation.
The data set was randomly divided into 10 parts, 9 of which were taken as
the training set and the remaining part was used as a test set. By
$A \xrightarrow{tr} B$ we denote an association rule generated on a training
context. The confidence of this association rule measured on the test set, i.e.,

$$conf(A \xrightarrow{test} B) = \frac{|A^{I_{test}} \cap B^{I_{test}}|}{|A^{I_{test}}|},$$

where the derivation operator is taken in the test context, shows the relative
amount of companies that bought phrase $B$ having bought phrase $A$.

We constructed 10 sets of association rules for 10 different training sets of
1800 companies each (with $min\_supp = 1.5\%$ and $min\_conf = 90\%$). The
aggregated quality measure of the obtained rules is the average confidence:

$$average\_conf(Rules_i) = \frac{\sum_{A \to B \in Rules_i} conf(A \xrightarrow{test} B)}{|Rules_i|},$$

where $Rules_i$ is the set of association rules obtained on the $i$-th training
set. We also considered rules with $min\_conf \ge 0.5$ and computed the averaged
confidence, which was again averaged over 10 cases:

$$average\_conf = \frac{\sum_{i=1}^{10} average\_conf(Rules_i)}{10}.$$


Table 5. Results of cross-validation for association rules.

92787    0.92



The confidence of rules averaged over the test set is almost the same as the
$min\_conf$ for the training set, i.e., $(0.9 - 0.87)/0.9 \approx 0.03$.



We also used the confidence measure for validation of metarules. Support does
not have much importance here, since we do not look for large markets or
best-selling phrases, but for stable dependencies of purchases. So, we
considered only rules with confidence larger than 0.8 (or 0.9). Confidence and
support for metarules are computed for the context $K_{FT} = (F, T, I_{FT})$. We
present the values of confidence and support in the tables for morphology-based
metarules.



Table 6. Average support and average confidence for morphology-based metarules.

$t \xrightarrow{FT} t_i$, such that $t_i^{I_{TS}} \subseteq t^{I_{TS}}$    15    0.49

$t \xrightarrow{FT} \bigcup_i t_i$, such that $t_i^{I_{TS}} \subseteq t^{I_{TS}}$    11    0.36    2006



We set the minimal confidence to 0.5 and compute the number of rules of each
group for which this threshold is exceeded. Table 5 shows that $average\_conf$ of
these metarules is actually much higher (about 0.9).



Table 7. Average supp and conf for morphological metarules for $min\_conf = 0.5$.

Rule types    Average supp

20    0.69

From Tables 6 and 7 one can easily see that the most confident and supported
rules are of the form $t \xrightarrow{FT} \bigcup_i t_i$. Note that the use of
morphology is completely automated and allows one to find highly plausible
metarules without data on purchases. The rules with low support and confidence
may be tested against recommendation systems such as Google AdWords, which uses
the frequency of queries for synonyms. For validation of ontological rules we
used the Google AdWords service: 90% of recommendations (words) were contained
in the list of synonyms output by AdWords.

7 Conclusion and further work 

The obtained results show that a part of the dependencies in databases of
purchases of advertisement phrases may be detected automatically, with the use
of standard means of computational linguistics. Along with methods of data
mining, these approaches allow one to improve recommendations and propose good
means of ranking, which is very important for making Top-N recommendations.
Another advantage of the approach consists in the possibility of detecting
related advertisement phrases not given directly in data. Results of FCA-based
biclusterization show the possibility of detecting relatively large
advertisement markets (with more than 20 participants) given by companies and
advertising phrases. To improve the proposed approach we plan to use
well-developed ontologies like WordNet for constructing ontology-based metarules.